{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "# Indices and Keys\n", "Indices are structures created to optimize access to data. They are used to retrieve data from a relational database in an efficient way. Indices speed up data retrieval in search and join operations in tables with many records, reducing the number of records that the engine reads to fetch the result.\n", "\n", "Keys are another concept somewhat related to indices. Keys special fields used to identify records within a table, they are a logical structure in our model primarily used to define relationships, whereas indices control the way the information is stored and retrieved in the database.\n", "\n", "On the other hand, the performance of write operations (insert and specially update) is penalised, so as a rule of thumb, you should use indices on columns that will be frequently used in where or join statements. \n", "\n", "## Primary Key\n", "The primary key is a field or combination of fields that uniquely identifies a row in a table. There can only be one primary key in a table. The primary key values must be unique and Not nullable.\n", "\n", "## Syntax in create statement\n", "\n", "You can use the ```CREATE``` statement to define the Primary Key and the indices in your table. Let us see it with an example:\n", "\n", "```mysql\n", "CREATE TABLE Person (\n", " PersonId INT NOT NULL AUTO_INCREMENT, -- Normal Integer. At most 4294967296 different PersonIds. \n", " Name VARCHAR(30) DEFAULT '', -- Name with at most 30 characters. Default empty string.\n", " IsAwesome TINYINT(1) DEFAULT 0, -- Either 0 (False) or 1 (True). Default 0 \n", " AwesomenessLevel DECIMAL(13,3) DEFAULT 10.0, -- Decimal value with 13 digits of precision, up to 10^10 and with three digits after the decimal point. Defaults to 10.0\n", " Gender ENUM('Male', 'Female', 'Non-Binary') DEFAULT 'Male', -- Either male, female, or non binary. Default 'Male'\n", " Created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, -- timestamp with the date of creation, defaults to current timestamp\n", " PRIMARY KEY (PersonId),\n", " INDEX current_timestamp_idx (Created)\n", " );\n", "```\n", "\n", "In the example, we have used two statements in the *PersonID* column to ensure that it is a valid identifier: ```NOT NULL``` to make it non-nullable (no row with null PersonId is ever allowed), and ```AUTOINCREMENT``` to auto increment the value of the identifier automatically (This is a MySQL specific feature).\n", "\n", "Below, we have used the statement ```PRIMARY KEY (PersonId)``` to make it the primary key. \n", "\n", "The statement ```INDEX current_timestamp_idx (Created)``` creates a new index in the column Created. This will speed up search statements that use this column in the WHERE statement. \n", "\n", " \n", "In database engines like MySQL ```KEY``` is a synonym for ```INDEX```. In others like Oracle, they are not equivalent. \n", "\n", " \n", " **IMPORTANT**\n", " Since the PersonId is now auto incremented by the engine, you should not include it in the field list in INSERT statements, as the engine will assign a value automatically.\n", " \n", "## Primary Key with Alter Table statement\n", " You can also add a primary key to a table using the ```ALTER``` statement:\n", " \n", " ```sql\n", "ALTER TABLE Person\n", " ADD PRIMARY KEY (PersonId);\n", "```\n", "\n", "If you want to name the primary key or use several fields as primary key use:\n", "\n", "```sql\n", "ALTER TABLE Person\n", " ADD CONSTRAINT PersonId_PK PRIMARY KEY (PersonId);\n", "```\n", "\n", "**IMPORTANT**\n", "If you use ALTER TABLE to add a primary key, the primary key column(s) must have been declared to not contain NULL values (when the table was first created). \n", "## Index Syntax with Create Index statement\n", " You can also create an index on an existing table using the ```CREATE INDEX``` statement\n", " \n", "```mysql\n", "-- Create an index named gender_isawesome_idx on Person\n", "CREATE INDEX gender_isawesome_idx\n", "ON Person (Gender, IsAwesome);\n", "```\n", "\n", "## Shorthand in MySQL CREATE TABLE statement\n", "The following syntax is also valid in MySQL:\n", "```mysql\n", "CREATE TABLE Person (\n", " PersonId INT NOT NULL AUTO_INCREMENT PRIMARY KEY, -- Normal Integer. At most 4294967296 different PersonIds.\n", " Name VARCHAR(30) DEFAULT '', -- Name with at most 30 characters. Default empty string.\n", " IsAwesome TINYINT(1) DEFAULT 0, -- Either 0 (False) or 1 (True). Default 0 \n", " AwesomenessLevel DECIMAL(13,3) DEFAULT 12.3, -- Decimal value with 13 digits of precision, up to 10^10 and with three digits after the decimal point. Defaults to 10.3\n", " Gender ENUM('Male', 'Female', 'Non-Binary') DEFAULT 'Male', -- Either male, female, or non binary. Default 'Male'\n", " Created TIMESTAMP DEFAULT CURRENT_TIMESTAMP, -- timestamp with the date of creation, defaults to current timestamp\n", " INDEX current_timestamp_idx (Created)\n", " );\n", "```\n", "\n", "By adding the PRIMARY KEY keywords in the definition of the PersonId field, MySQL adds the PRIMARY KEY constraint, Not nullabe condition and uniqueness restriction and in this sense it is a shorthand to define the Primary key. \n", "\n", "\n", "## Drop Primary Key\n", "To drop a primary key use:\n", "\n", "```mysql\n", "ALTER TABLE Person\n", " DROP PRIMARY KEY;\n", "```\n", "\n", "If you used a named constraint to create the primary key use instead:\n", "\n", "```sql\n", "ALTER TABLE Person\n", " DROP CONSTRAINT PersonId_PK;\n", "```\n", "\n", "## Foreign Keys\n", "Foreign keys are used to explicitly define in the model the relations in our data. Foreign keys will prevent users from creating inconsistencies when deleting data.\n", "Foreign keys also make join operations more efficient since the engine has more context information to optimise internal data search operations.\n", "\n", "Let us create the following table Order:\n", "\n", "```mysql\n", "CREATE TABLE Order (\n", " OrderId INT NOT NULL AUTO_INCREMENT,\n", " OrderNumber int NOT NULL,\n", " Person_fk int,\n", " \n", " PRIMARY KEY (OrderID),\n", " CONSTRAINT Order_Person_fk FOREIGN KEY (Person_fk) REFERENCES Person(PersonId)\n", ");\n", "```\n", "The column ```Person_fk``` stores the relation (1:N) between Person and Order, specifying which person placed each order. It references the primary key of the \"Person\" table, \"PersonId\". \n", "This relationship is specified using the ```FOREIGN KEY``` statement in the definition of a constraint.\n", "In this data model, if users try to delete a product referenced by an order in the Order table, the database engine will raise an error to prevent inconsistent data in the order table.\n", "\n", "You can also use the ```ALTER TABLE``` statement to add foreign key constraints:\n", "\n", "```sql\n", "ALTER TABLE Order\n", " ADD CONSTRAINT Order_Person_fk FOREIGN KEY (Person_fk) REFERENCES Person(PersonId);\n", "```\n", "\n", "## Drop a Foreign key\n", "You can use the ```DROP CONSTRAINT``` statement to drop a foreign key constraint:\n", "```sql\n", "ALTER TABLE Order\n", " DROP CONSTRAINT Order_Person_fk;\n", "```\n", "# Join clauses\n", "As explained in the [Introduction to Databases](./Introduction%20to%20Databases.ipynb), relational databases store data in different tables. In SQL, to retrieve related data from different tables, we use join clauses in the ```SELECT``` statement. Hereby we will explain the most common join clauses in SQL.\n", "\n", "## Inner join\n", "The ```INNER JOIN``` clause selects records in both tables that meet the ```ON``` condition and the conditions in the ```WHERE``` clause, if present.\n", "\n", "The syntax of the Inner join is:\n", "\n", "```sql\n", "SELECT column_name(s)\n", "FROM table1\n", "INNER JOIN table2\n", "ON table1.column_name = table2.column_name;\n", "```\n", "\n", "For example, the clause, \n", "\n", "```sql\n", "SELECT p.Name, o.OrderNumber\n", "FROM Person p\n", "INNER JOIN Order o\n", "ON p.PersonId = o.Person_fk\n", "```\n", "Will return the names of all persons that placed an order together with the corresponding order numbers. If any person in the table did not place any order, it will not be present in the result.\n", "\n", "## Left join\n", "The ```LEFT JOIN``` clause selects all records from the left table that meets the criteria in the ```WHERE``` clause, if present, together with the selected fields from the right table.\n", "\n", "The syntax of the Left join is:\n", "\n", "```sql\n", "SELECT column_name(s)\n", "FROM table1\n", "LEFT JOIN table2\n", "ON table1.column_name = table2.column_name;\n", "```\n", "\n", "For example, the clause, \n", "\n", "```sql\n", "SELECT p.Name, o.OrderNumber\n", "FROM Person p\n", "LEFT JOIN Order o\n", "ON p.PersonId = o.Person_fk\n", "```\n", "Will return the names of all persons, together with the corresponding order numbers. If any person in the table placed more than one order, is name will appear repeated the number of order he or she placed. \n", "\n", "## Right join\n", "The ```right JOIN``` clause selects all records from the right table that meets the criteria in the ```WHERE``` clause, if present, together with the selected fields from the left table.\n", "\n", "The syntax of the Right join is:\n", "\n", "```sql\n", "SELECT column_name(s)\n", "FROM table1\n", "RIGHT JOIN table2\n", "ON table1.column_name = table2.column_name;\n", "```\n", "\n", "For example, the clause, \n", "\n", "```sql\n", "SELECT p.Name, o.OrderNumber\n", "FROM Order o\n", "RIGHT JOIN Person p\n", "ON p.PersonId = o.Person_fk\n", "```\n", "\n", "will have the same result as the previous example with ```LEFT JOIN```. \n", "\n", "## Outer join\n", "The ```Outer JOIN``` clause selects all records from the left table and all records from the right table that meet the criteria in the ```WHERE``` clause, if present.\n", "\n", "The syntax of the Outer join is:\n", "\n", "```sql\n", "SELECT column_name(s)\n", "FROM table1\n", "FULL OUTER JOIN table2\n", "ON table1.column_name = table2.column_name;\n", "```\n", "\n", "For example, the clause, \n", "\n", "```sql\n", "SELECT p.Name, o.OrderNumber\n", "FROM Person p\n", "FULL OUTER JOIN Order o\n", "ON p.PersonId = o.Person_fk\n", "```\n", "Will return the names of all persons, together with the corresponding orders they placed, if by any chance there were orders without a foreign key to Person (for instance self placed orders), they would appear too. \n", "\n", "\n", "\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.6" }, "pycharm": { "stem_cell": { "cell_type": "raw", "source": [], "metadata": { "collapsed": false } } } }, "nbformat": 4, "nbformat_minor": 0 }